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ABSTRACT 

We use cosmological simulations to explore the large-scale effects of reionization. 
Since reionization is a process that involves a large dynamic range - from galaxies to 
rare bright quasars - we need to be able to cover a significant volume of the universe in 
our simulation without losing the important small scale effects from galaxies. Here we 
have taken an approach that uses clumping factors derived from small scale simulations 
to approximate the radiative transfer on the sub-cell scales. Using this technique, we 
can cover a simulation size up to 1280/i~^ Mpc with Wh^^ Mpc cells. This allows us 
to construct synthetic spectra of quasars similar to observed spectra of SDSS quasars 
at high redshifts and compare them to the observational data. These spectra can then 
be analyzed for HII region sizes, the presence of the Gunn-Peterson trough, and the 
Lyman-a forest. 

Subject headings: cosmology: theory - cosmology: large-scale structure 



1. Introduction 

The term "reionization" is used to de- 
scribe the process that turned the once neu- 
tral universe into the universe we observe to- 
day, where the neutral fraction of the gas is 
less than 10~^. Around z ~ 1100, the ex- 
pansion of the universe cooled the Cosmic 
Microwave Background to the point where 
it could no longer keep the gas ionized, and 
hydrogen recombined to a neutral fraction 
of ~ 0.9999. At this point fluctuations in 
density, stemming from quantum fluctuations 
in the early universe, were imprinted on the 
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CMB in the form of temperature fluctuations. 
These density variations continued to collapse 
after recombination and started forming the 
first gravitationally bound objects like galax- 
ies and quasars. Radiation from these first 
objects later began to reionize the universe. 
A lot of work has been done on reionization 
starting with Giroux & Shapiro in 1996 and 
continued in many papers, including but not 
limited to: Tegmark et al. 1997; Gnedin & 
Ostriker 1997; Haiman & Loeb 1998; Gnedin 
2000; Miralda-Escude, Haehnelt & Rees 2000; 
Loeb & Barkana 2001; Bruscoli, Ferrara & 
Scannapieco 2002; Haiman 2002; Lidz et al. 
2002; Hui & Haiman 2003; Whyithe & Loeb 
2003. 

Before reionization, bubbles of ionized ma- 
terial formed around objects emitting ionizing 
radiation in high density regions. These re- 
gions of ionized gas were still separated from 
each other by neutral high density gas, be- 
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cause a large number of photons are needed to 
keep high-density regions ionized. The abun- 
dance of ionizing photons increased with time 
as more objects formed, consequently creat- 
ing more ionized bubbles. The result was a 
universe containing high density regions that 
were ionized close to ionizing sources and neu- 
tral high-density regions further away from 
sources, embedded in lower density gas that 
was becoming increasingly ionized. 

Eventually the ionized regions expanded 
more and started to overlap. During the 
overlap period the mean free path of ionizing 
photons increased drastically, because more 
sources were visible in each "bubble" , so that 
all the low and average density gas was ion- 
ized quickly. The overlap period was brief and 
left small, high density pockets of neutral gas. 
The regions around quasars however, usually 
assumed to be of high-density, were highly 
ionized. The ionizing intensity still contin- 
ued to rise and the neutral regions slowly de- 
creased. This period is called post-overlap 
(Miralda-Escude, Haehnelt & Rees 2000). 

Reionization can be probed effectively with 
high-redshift quasar absorption spectra. In 
particular, a Gunn-Peterson trough (Gunn & 
Peterson 1973) should appear at redshifts near 
and above the epoch of reionization, because 
of the higher neutral fraction of distributed 
gas. Several quasars have now been observed 
at high enough redshifts {z > 6) to probe 
the epoch of reionization (Fan et al. 2002; 
Songaila & Cowie 2002; White et al. 2003; 
Songaila 2004). 

The Lyman-alpha forest observed in the 
spectra of these quasars is remarkably dis- 
similar to the classical Lyman-alpha forest at 
z between 2 and 4, or, for that matter, to 
any other type of astrophysical spectra. The 
Lyman-alpha forest at z > 5 becomes so dense 
that flux manages to penetrate only within 
individual gaps between blends of absorption 
lines. The standard cosmological theory of 



the Lyman-alpha forest cannot be applied to 
this kind of spectra, and, and the present mo- 
ment, detailed numerical simulations appear 
to be the only method to create realistically 
looking synthetic spectra of z > 4 forest. 

To be observable, high redshift quasars 
must be very bright. Because such quasars 
can ionize an appreciable region around them, 
the line of sight to them may not be typical. 
This bias can be hard to assess from observa- 
tions alone, but can be analyzed using cosmo- 
logical simulations (Mesinger & Haiman 2004; 
Mesinger, Haiman & Cen 2004). 

To compare the synthetic spectra to obser- 
vational data we need to include high lumi- 
nosity quasars in our simulation. This means 
that we need to create a model of a signifi- 
cant volume of the universe, so that enough 
objects with low volume density, such as high 
luminosity quasars, are included. Such a task 
is not trivial since we have to cover a large 
range of scale, from stars, which are thought 
to be vital in the process of reionization, to 
extremely rare quasars of luminosities in ex- 
cess of 10^^ L0. The simulation has to fol- 
low many matter components (e.g. dark mat- 
ter, gas, stars, and quasars) and also follow 
the radiative transfer in order to model such 
a complicated process (Gnedin 2000; Ciardi, 
Stoehr k White 2003; Sokasian et al. 2003; 
Sokasian, Abel & Hernquist 2001; Ciardi, Fer- 
rara & White 2003). 

In this paper we create a physical model 
of reionization in large volumes. For these 
large-scale simulations we have to develop a 
method to include the small scale structure. 
Here this is done with a formalism of "clump- 
ing factors" to approximate radiative transfer 
at scales smaller than the resolution of our 
simulation. Thus, a clumping factor model 
allows us to drastically expand the dynami- 
cal range of our simulations and thus create a 
physical model of reionization, on scales un- 
reachable by other approaches. 
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First we will describe the simulations and 
the method used to obtain the models for 
the clumping factors (Section 2.1 and Sec- 
tion 2.2). Then we will demonstrate how we 
obtained the synthetic spectra and show some 
applications they can be used for (Section 3.1 
and Section 4). In the last parts of the paper 
we will show the results obtained so far and 
finish with a discussion of future work. 

2. Approach 

2.1. Simulations 

We use the "Softened Lagrangian Hydro- 
dynamics" (SLH; Gncdin 1995) code as our 
main simulation tool. While the detailed 
methods for following gas dynamics and the 
dark matter dynamics are not important in 
this work, since we arc dealing with large, 
quasi- linear scales, the main advantage of the 
SLH code for this project is its ability to fol- 
low time-dependent and spatially-resolved ra- 
diative transfer using the Optically Thin Ed- 
dington Tensor Approximation (OTVET) of 
Gnedin & Abel (2001). We have modified the 
code to include the clumping factors in the 
ionization and thermal balance equations. 

In simulating large volumes, such as our 
256^"^ Mpc to 128/i~^ Mpc boxes, it is impor- 
tant to include various sources of ionizing ra- 
diation correctly. In this paper we split the to- 
tal source term in the radiative transfer equa- 
tion S into two components: a "smoothly dis- 
tributed" component Ss that includes sources 
that cluster on scales below our resolution, 
and an "isolated sources" component Sj that 
accounts for sources that cluster on scales 
above our resolution length. For the smoothly 
distributed component we assume that the 
source density is directly related to the matter 
density p with a power-law relation, 

Ss{x,iy,t)=g,{t)p''^{x,t), (1) 

where bs is the "bias factor" and gi, is the 



spectral shape of the smoothly distributed 
sources (which can be time-dependent). The 
isolated sources component consists of indi- 
vidual sources that follow the pre-defined lu- 
minosity function and are biased with respect 
to the matter density with a bias factor bj. 

In this paper we assume that the only type 
of isolated sources present in the universe are 
quasars. We adopt the parameterization of 
the quasar luminosity function from Schirber 
k Bullock (2003) 

"^^^^ = {L/lJb + Il/L,)if^ (2) 

where and L^, are parameters, and the faint 
and bright slopes of the luminosity function 
are 7^ = 1.6 and 75 = 2.6 respectively. Fol- 
lowing Schirber &; Bullock (2003), we param- 
eterize time dependence of and as 

= £2.72 ^ io2-80+0-8lMGpC-3, 

= 6-1-^2^1012.1-0.81(^-3)^^ (3) 

The advantage of this parameterization is 
that it satisfies all known observational con- 
straints on the quasar luminosity function and 
the only freedom remaining is encapsulated 
into the "effective emissivity parameter" e(z). 
Schirber & Bullock (2003) argued that from 
the measurements of the proximity effect and 
the Lyman-alpha forest flux decrement the 
value of e{z) Sit z ^ ?, should be about 1. In or- 
der to extrapolate e{z) to higher redshifts, we 
assume that it approximately follows the star 
formation rate from the small box simulations 
described in Gnedin (2000), and we adopt the 
following analytical fit to this dependence, 

e(2)=Cexp(-((l + z)/9)3), (4) 

where the constant C is chosen so that e(3) = 
1. 

Thus, our procedure for computing the to- 
tal source function is the following. At each 
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time step we compute the value for the crit- 
ical quasar luminosity for which there is at 
most one quasar in one resolution element. All 
dimmer quasars are counted as "smoothly dis- 
tributed sources", while all brighter quasars 
are counted as "isolated sources". We then 
compute the spectral shape of the "smoothly 
distributed sources" as the sum of the stel- 
lar contribvition and the contribution from the 
low luminosity quasars. Then individual point 
sources of ionizing radiation are distributed 
in the computational box with the luminosity 
function given by equation (2). The sources 
are distributed randomly throughout the com- 
putational box with the probability to be lo- 
cated at a given point that is proportional to 
the local mass density in hi power. For the 
simulations described in this paper we adopt 
hs = 2 and 5/ = 3. 

The isolated sources continue to emit ion- 
izing radiation for a pre-defined period of 
time, the quasar "lifetime". In this paper 
we assume that all quasars have the same 
lifetime, independent of their luminosity, al- 
though a luminosity-dependent lifetime would 
be straightforward to incorporate into our 
method. 

2.2. Clumping Factors 

Presently simulations of box sizes larger 
than about 10/i~^ Mpc do not have enough 
spatial resolution to adequately resolve the 
structure in the gas down to the smallest 
scale (the so-called "filtering" scale, Gnedin 
&: Hui 1998). Our simulation has a resolu- 
tion of 2/1^-*^ Mpc for example, which does not 
allow us to resolve features in the Ly-a for- 
est. Thus, the evolution of gas on spatial 
scales below the resolution limit must be de- 
scribed approximately with a phenomenologi- 
cal model. Such a model is often called "sub- 
cell physics". In the case of ionization evo- 
lution of low density gas in the IGM, a sub- 
cell model can be fully described by a set of 



"clumping factors" (Gnedin & Ostriker 1997; 
Madau, Haardt & Rees 1999; Ciardi et al. 
2000; Miralda-Escude et al. 2000). 

Let us consider the ionization balance 
equation for hydrogen: 



dt 



nm = -SiJnm - nniT + R{T)nenmi, (5) 



where nni, HhiI) and Uf. are number densities 
of neutral hydrogen, ionized hydrogen, and 
free electrons respectively, V is the photoion- 
ization rate, RiT) is the hydrogen recombina- 
tion coefficient, and H is the Hubble parame- 
ter. Equation (5) holds at every point in the 
IGM. Let us now impose a finite spatial scale 
on the true distribution of cosmic gas - in our 
case, the finite resolution scale of a simulation 
will be such a scale. Averaging equation (5) 
over such a scale (let us call it a "cell"), we 
obtain: 



-3iJ(nHi) - (nmr) 

+ {R{T)n,nnii) (6) 



Since numerical simulation can only deal 
with quantities defined within one cell, we 
must express the right hand side of equation 
(6) as a function of physical quantities aver- 
aged over one cell, namely 

^nm = -3-H"nHi - C/nnif + CRRhehmi, 
at 

(7) 

where we use the notation such that for any 
physical quantity / the tilde symbol repre- 
sents the average over one cell, 

/ = (/)celb 

and Cj and Cr are "clumping factors" defined 
as 



Cr 



{R(T)nenHii) 
{R{T)){ne){nnn)' 



(8) 
(9) 
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Analogously, the radiative transfer equa- 
tion for the spatially variable ionization inten- 
sity Ji, averaged over one cell can be written 
as 

- CikJ, + S„ (10) 

where k,^ is the absorption coefficient and Si, 
is the source function given by 

= S,,i + S,,s- (11) 

Here Siyj is the source function component 
from individual resolved quasars and S^^s is 
the component from the smooth background. 

In particular, it is worth noting that we 
have two clumping factors, one in the recom- 
bination term, Cr, and another one in the 
ionization term, Cj. Both clumping factors 
are necessary to properly account for the evo- 
lution in ionization state of cell-averaged hy- 
drogen number density. 

Clumping factors cannot be derived from 
the large-scale simulations that only resolve 
structures on scales above one cell. Thus, ad- 
ditional information must be used to specify 
the clumping factors and close equation (7). 

To determine the clumping factors, we use 
a simulation of reionization with the small size 
of the computational box {4h~^ Mpc) with the 
Softened Lagrangian Hydrodynamics (SLH) 
code (Gnedin 1995, Gncdin 2000). While this 
simulation has too small a box size to be use- 
ful for modeling effects of bright rare quasars, 
it has enough spatial resolution to follow the 
structure of the IGM down to the filtering 
scale, and thus, can be used for computing the 
clumping factors on scales of interest. Specif- 
ically, we split the 4h~^ Mpc box into 8 cubes 
2h^^ Mpc on a side and averaged njji, raHii, 
He, R{T), r, and their appropriate products 
over the volume of each of 8 cubes. We have 
done this for a range of redshifts from z ~ 5 to 



z ^ 12, obtaining 8 data points for each of the 
two clumping factors for each redshift value. 
Each cube gives us a different value, depend- 
ing on the ionization and the density in its 
volume. This method yields clumping factors 
that arc not dependent on redshift, but only 
on properties of the gas. We then fitted both 
clumping factors as functions of two variables, 
the neutral hydrogen fraction x = tIhi/^h, 
and the gas density p. 

Formally, averaging in equation (6) is per- 
formed over the volume of one cell. However, 
different weightings can be used in making the 
averages. For example, if we take a simple 
volume- weighted averages in equation (6), we 
will be including ionizations and recombina- 
tions not only in the low density IGM, but also 
in the high density regions within the virial- 
ized halos. Thus, both clumping factors will 
be large, but a significant fraction of ionizing 
photons will be absorbed locally, within the 
parent halo of a ionizing source, so the ioniza- 
tion and recombination terms in equation (7) 
will nearly cancel each other. This behavior 
results in numerical loss of precision, and is 
not desirable. 

Alternatively, we can weight the low den- 
sity gas more than the high density regions, 
reducing both clumping factors, but still keep- 
ing them consistent with each other. In that 
case we would exclude some of the ionizations 
and recombinations that take place in high 
density regions from equation (7). Such an 
exclusion is equivalent to counting only a frac- 
tion of ionizing photons in the ionization bal- 
ance equation, and assuming that the rest of 
ionizing photons are absorbed locally, within 
the immediate vicinity of the ionizing source. 
This fraction is commonly known as the "es- 
cape fraction", although the specific mathe- 
matical definition of the escape fraction de- 
pends on the specific prescription of how av- 
eraging is done in equation (6). 

In this paper, we consider three different 
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cases for performing averaging in equation 
(6). If the clumping factors are computed self- 
consistently and the escape fraction is chosen 
appropriately, then the results of a simulation 
should be independent of the way averaging is 
performed. Thus, comparing results of simu- 
lations with different types of averaging used 
allows us to estimate the uncertainties due to 
our final spatial resolution and inaccuracies in 
parameterizing clumping factors. 

The first case (case A) we consider is 
where the averaging in equation (6) is volume 
weighted, i.e. all local absorption counts, in- 
cluding the photons that arc absorbed close to 
the source. In this case all ionizations and re- 
combinations are counted equally, no matter 
how high the density in a cell is. The clumping 
factors for case A are large, since all photons 
are taken into accoTint. and the high density 
regions are included. The relations between 
the clumping factors for Case A can be seen in 
Figure 1. The recombination clumping factor 
is not a strong function of neutral fraction, be- 
cause it is dominated by high density regions. 
It is well approximated by a power law in den- 
sity, Cr oc p^'^. The same power law holds for 
the photoionization clumping factor, but here 
we additionally have a dependence on neu- 
tral fraction. The photoionization clumping 
factor decreases with increasing neutral frac- 
tion, showing that the photoionization rate 
and neutral fraction are anti-correlated. 

The next two cases reduce the importance 
of the high density and so exclude some of the 
ionizations and recombinations in high den- 
sity regions. 

The second model (Case B) weights the vol- 
ume by the inverse density, therefore remov- 
ing some local absorption. This increases the 
relative weight of the low density regions, so 
that high density regions do not contribute as 
much to the volume considered for the clump- 
ing factors. The clumping factors arc much 
smaller in this case, eliminating a lot of the 
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Fig. 1. — Unweighted Case (Case A): Cr/p^-^ 
versus neutral fraction {top) and Cij (?"^ ver- 
sus neutral fraction (bottom). Functional fit 
to the data is shown in gray. 



numerical problems that arise in case A. Fig- 
ure 2 shows that they are of order unity, com- 
pared to order of 10 for case A. Additionally 
the exponent in the power law dependence of 
the clumping factors on p is lower compared 
to case A. Both factors depend on the neu- 
tral fraction in a more complicated way. The 
scatter is relatively small for the recombina- 
tion clumping factor and in the low neutral 
fraction regime of the photoionization clump- 
ing factor. Figure 2 shows that Cr increases 
with increasing neutral fraction, meaning that 
the regions with lower temperature and there- 
fore higher recombination rate are more neu- 
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tral, as is expected. This effect only appears 
when the neutral fraction is low enough, corre- 
sponding to time outputs before reionization. 
Cr does not depend on the neutral fraction 
below xhi ~ 0.1, because the high density re- 
gions dominate again at high ionization frac- 
tions. 

Similarly to case A, Cj shows a decrease 
toward higher neutral fraction, but shows no 
dependence on xhii at low neutral fraction. 
This change in the slope of the data implies 
that the anti-correlation of the photoioniza- 
tion and the neutral fraction disappears when 
the HII regions are overlapping. 
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Fig. 2. — Case B: Cr/ p^''^ versus neutral frac- 
tion (top) and Ci/ p^'"^ versus neutral fraction 
{bottom). Functional fit to the data is shown 
in gray. 



In the third model (Case C) all volume el- 
ements with gas density higher than the den- 
sity at the virial radius were removed from the 
calculations of the clumping factors. Conse- 
quently, in the modeling of case C only pho- 
tons that escape the high density halos are 
used to reionize the IGM. This removes all 
local ionization and absorption from the sim- 
ulation. Similarly to case B the clumping fac- 
tors are small compared to case A. Also, most 
features in Figure 3 are similar to case B in 
the distribution of data for case C. In partic- 
ular the recombination clumping factor dis- 
tribution has the same shape as Cr for case 
B, even though its amplitude lies at about 6 
instead of 2 in case B. 

On the other hand the photoionization 
clumping factor is best fit by a differently 
shaped function that remains nearly constant 
at low neutral fraction. The data for case C at 
higher neutral fraction closely resembles case 
B. These distributions shows the similarity 
of the behavior of their clumping factors with 
respect to changes in neutral fraction and den- 
sity for the two density weighted models. As 
mentioned above all three models should be 
consistent with each other when the appropri- 
ate effective escape fractions are set, but the 
data shows that model B and model C are also 
similar in the details of their approximations. 

Case B appears to be the model most 
closely resembling the true physical situation, 
because it does not completely ignore the pho- 
tons coming from high density regions, and 
also considers that many of them should get 
absorbed on their way out. There is signif- 
icant scatter of the data around the fit, be- 
cause clumping factors cannot be represented 
as simple functions of gas density and ioniza- 
tion state, but also depend on other proper- 
ties like a distributions of sources, ionization 
history, clustering of sources, etc. We cannot 
represent this scatter of the data by simple 
averages of density and neutral fraction, since 
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Fig. 3. — Case C: Cr/p^'^^ versus neutral 
fraction {top) and Cj versus neutral fraction 
{bottom). Functional fit to the data shown in 
gray 

it is caused by variations on smaller scales. In 
the future, statistical treatment of the data 
could help to improve the fitting and take into 
account the stochasticity in the local proper- 
ties of the IGM, but this is outside the scope 
of this paper. 

To sum up, using these models of "sub-cell 
physics" allow us to increase the cell size of 
the simulation from kpc to Mpc scales while 
approximately retaining the effect of physical 
processes at small scales. 



2.3. Escape fraction 

Generally the phrase "escape fraction" is 
understood as the fraction of ionizing photons 
that leave a high density region surrounding a 
source without being absorbed locally. We de- 
fine an effective escape fraction fggc that mea- 
sures the amount of ionizing photons emitted 
from homogeneously distributed sources (see 
equation 11). Dropping the subscript ly the 
smooth component of the source function for 
each frequency can be described with: 



Ss = fe 



with 



and 



5(0) 



O.le(z) 



Mr, 







yr Mpc' 



(12) 



(13) 



(14) 



where Sg'\ is the contribution from stars in 
galaxies, Ss,u is the contribution to the source 
function from unresolved quasars, and /esc is 
the effective escape fraction. Also, is the 
star formation rate described by the gas den- 
sity pgas and the efficiency e. Ss,u is usually 
negligible compared to the stellar component, 
but has to be included for completeness. Both 
components are given by the star formation 
rate and thus arc the same in all simulations, 
making f^gc the only free parameter. 

Equation 12 shows that any change in fgsc 
will change the amount of photons present 
outside the high density regions and thus the 
average ionizing flux transmitted through the 
box. This in turn changes the amount of neu- 
tral hydrogen present. Thus, decreasing the 
escape fraction results in a lower number of 
photons available for ionizing the IGM and 
influences the time of reionization. In our 
simulations, we treat the escape fraction as 
a phenomenological free parameter, which we 
adjust to fit the mean transmitted flux ob- 
served in the spectra of SDSS quasars (See 
Section 5). 
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For each of the three cases of clumping 
factors mentioned above, we have to adjust 
the effective escape fraction, since each case 
treats the photons emitted and absorbed in 
high density regions differently. First we have 
Case A, where we include all the local absorp- 
tion in calculating the clumping factors, and 
secondly case B, where we weight the volume 
by the inverse density, increasing the influ- 
ence of the lower density regions. Thus, both 
cases include the high density regions and the 
photons absorbed there. On the other hand 
case C does not include the high density re- 
gions and therefore less photons are used in 
the reionization process. 

In general when escape fractions are cal- 
culated, only photons that actually escape 
from the source are counted. Considering that 
these sources are commonly positioned in high 
density regions in the IGM , this more general 
definition of escape fraction best corresponds 
to case C. 

Since the effective escape fraction describes 
the amount of photons that escape compared 
to the amount of photons produced, fesc 
should be unity for case A, if the star for- 
mation rate used for the simulation is correct, 
since all photons are counted. Our model 
given by equation 4, however, does not calcu- 
late the star formation rate perfectly realisti- 
cally. It is expected that the effective escape 
fraction will be larger if the star formation 
rate is underestimated to correct for the lack 
of photons. Additionally, errors in the cal- 
culation of Cj and Cr due to scatter in the 
simulated data (see Section 2.2) change the 
value that is required for f^sc to yield data in 
agreement with observations. This problem is 
compounded by the limited resolution of the 
small-box simulation used to develop the fit 
for the clumping factors. 



3. Spectra 

3.1. Lines of Sight 

The main purpose of our large scale simula- 
tions is to produce synthetic absorption spec- 
tra, which can be compared directly to actual 
spectra. First we determine the positions of 
the brightest quasars present in the simula- 
tion box at the chosen redshift to get starting 
points for the lines of sight. Beginning the 
spectrum at precisely the source's position is 
important, because only then does the HII re- 
gion surrounding it appear in the spectrum. 
This will allow us to compare our synthetic 
spectra to absorption spectra of high-redshift 
quasars where the source is used as the back- 
ground lighting and derive properties of the 
HII regions surrounding the bright quasars. 
The spectrum will not display the increase in 
transmitted flux expected for an HII region at 
the high-z end of the spectrum if it starts at 
a different position. 

After determining the starting point, we 
cast a ray from the quasar position through 
the box by setting a random direction and fol- 
lowing it on a straight line through the box. 
To get additional resolution and a smoother 
spectrum, we cast the ray with a step size one 
quarter of the cell size. The values of density, 
neutral fraction and temperature are output 
at each step by using a weighted average of 
their values at the 8 closest grid-points. Be- 
cause many spectra sampling different direc- 
tions are necessary for statistical analysis of 
the HII regions and IGM properties, we cre- 
ated three to five spectra for each of the 50 
brightest objects in the simulation box. 

Because of the large distances along the 
ray, the universe expands measurably while an 
imaginary photon of our ray of light crosses 
the computational box. Thus, we have to 
take into account the expansion of the uni- 
verse along the spectrum. Accordingly, we not 
only have to cast the line of sight in space, but 
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also have to consider the time evolution, using 
consecutive output files of the simulation. 

The spectra so obtained represent random 
samples of the universe between z = 6.5, cor- 
responding to 9120A for Lyman-a emission, 
and z = 4.0. The starting points for the 
different sets of spectra were picked to be 
the redshifts of the most distant quasars ob- 
served and also included some lower redshifts 
to determine the evolution of HII region sizes 
(starting points between z = 6.5 and z = 5.9). 
The comoving distance between z = 6.5 and 
z = 4.0 is Sdlh"^ Mpc, so depending on the 
direction of the line of sight and the position 
of the quasar a spectrum can cover a distance 
too long for one box size. In this case the 
spectrum wraps around the periodic box to 
cover the wavelength range from before to af- 
ter reionization. The direction of the ray is 
chosen randomly to get around the problem 
of the ray passing repeatedly through exactly 
the same region of the simulation. This stack- 
ing of boxes is similar to the method used by 
Mesinger et al. (2004) and Cen et al. (1994). 

3.2. Small Scale Structure 

Prom the simulation output alone we can 
only extract information aboTit large scale 
variations in p, nni, nnii and T, because the 
resolution of the simulation is limited. High- 
resolution spectra from instruments such as 
Keck, can reveal information about more neu- 
tral gas in between the quasar and the ob- 
server on small scales using the Lyman-a for- 
est lines. For example the spectrum (Figure 4) 
of quasar SDSS J1030 at z = 6.28 has a res- 
olution of S\ = 1.63 A whereas our resolution 
from the simulation is about (5A = 9 A. 

Typically, normal sized galaxies or Lyman 
limit systems show Lyman-a features on a few 
A scale, smaller than the resolution of the 
simulation. To allow comparison to real obser- 
vations we add, a posteriori, small scale struc- 
ture due to density fluctuations below the res- 



olution limit. 

In order to generate the small scale struc- 
ture along the simulated line of sight, we com- 
pute the ID matter power spectrum as 

POD 

PiD{k) = TT / qP3D{q)dq, (15) 
J h 

where Pm (k) and P^d (q) are one-dimensional 
and three-dimensional power spectra respec- 
tively. A similar equation can be written for 
the velocity fluctuations, which are needed to 
take into account the effects of Doppler shifts 
from peculiar motion. To get a resolution in 
wavelength space fine enough to be able to 
compare the flux to observational spectra the 
number of steps n were chosen to be 2^^. We 
then apply a Fast Fourier Transform to gener- 
ate a real-space representation of linear over- 
density Sl and velocity vl along a line of sight 
on a uniform spatial mesh with mesh spacing 
Ax = Wh~^ kpc (small enough so that ther- 
mal broadening of the synthetic spectra will 
alleviate the dependence on this parameter). 

We need to modulate the small scale struc- 
ture using the large scale variations directly 
obtained from the simulation in density Psimi 
temperature Tgim, and neutral hydrogen den- 
sity nHi,sim along the simulated line of sight 
to obtain the small-scale distributions. This 
combines the data obtained from the simu- 
lation with a resolution on scales of a few 
h"^ Mpc with the fluctuations implied by the 
power spectrum on much smaller scales. 

We transform the small scale linear fluctu- 
ations 6l along a line of sight using the log- 
normal model (Shandarin et al. 1995; Bi & 
Davidsen 1997) to get 

2 

Plos{l) = Psime^'-'^ , (16) 

and then use this pios{l) to obtain the dis- 
tributions for temperature and neutral hydro- 
gen density: 

Tlos{l)=Tsim f^V'' (17) 
\ Psim J 
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nHi,ios{l) = nni. 



Tio 



0.7 



Plo 



,(18) 



where p is the mass density, T is the temper- 
ature, riHi is the neutral fraction of hydro- 
gen and ^i:, is a small-scale over-density. The 
subscript "los" refers to small-scale structure 
along the line of sight and the subscript "sim" 
again refers to values obtained directly from 
the simulation (low spatial resolution). 

The last two equations assume ionization 
equilibrium, which is a relatively good ap- 
proximation after the overlap of HII regions 
(Gnedin 2000), and a density-temperature re- 
lation in the form T ~ p^'^, which is a rea- 
sonable approximation for 5 < z < 6 redshift 
interval (Hui k Gnedin 1998, Gnedin 2001). 

The resulting distributions of density, neu- 
tral fraction, temperature and velocity, yield 
a synthetic absorption flux along the line of 
sight that includes the small scale fluctuations 
resulting in a synthetic spectrum that repre- 
sents both large and small scale structure. 
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Fig. 4.— Spectrum SDSSJ1030-h0524 (top) 
and a synthetic spectrum including noise and 
resolution (bottom). Absorption by the com- 
plete Lyman series was included. 



4. Quasars and HII Regions 

Quasars are the brightest objects known 
at many wavelengths and thus they can be 
observed at high redshifts. The brightest 
quasars have a bolometric luminosity at z=6 
of around 1O^^'^L0. The emitted radiation is 
mainly in the UV, and can ionize a large re- 
gion of gas surrounding the quasar, called the 
HII region. 

In general the size of the HII region de- 
pends on both density and dumpiness of the 
gas and also the abundance of ionizing pho- 
tons. This means that in addition to density 
and dumpiness, the age and luminosity of the 
source plays an important role in the process 
of forming the HII region and determine its 
spatial growth. 

Analyzing the distribution of the HII re- 
gion sizes and how they evolve with time can 
give us information about the characteristics 
of quasars and the properties of the gas sur- 
rounding them. In this paper we use the lines 
of sight described in Section 3.1 to determine 
the size of the HII regions. The edge of the HII 
region around a quasar should be detectable 
as a dear increase in Ly-a absorption just red- 
ward of the quasars Ly-cc emission line. 

The fact that the neutral fraction increases 
further away from the quasar due to the dilu- 
tion of photons can be used to determine the 
size of the HII region. It means that there 
will be more neutral gas further away from 
the quasar, so the ionizing radiation cannot 
escape and there is a decrease in transmitted 
flux. We define the edge of the HII region to 
be the minimum of the flux, before it increases 
again due to general reionization of the uni- 
verse. To be able to measure this HII region in 
the simulation, the quasar needs to be bright 
enough to ionize an area large enough around 
itself so that the resolution of the simulation 
is not larger than the HII region. 

Additional problems in detecting the HII 
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regions and determining their sizes are caused 
by the fact that the HII regions are not nec- 
essarily spherical and also that the quasars 
do not have to be positioned at their cen- 
ter. The irregular shape of the HII regions 
is mainly due to the "lumpincss" of the gas, 
which means that less dense parts of the IGM^ 
are ionized faster than other more dense ones. 

There are of course many different param- 
eters that can influence the size distribution, 
only some of which can be understood and 
modeled. For example, larger regions can be 
caused by lower density in a relatively uniform 
IGM, higher luminosity of the quasars or life- 
times long enough for the quasars to reach 
equilibrium. 

The synthetic spectra can serve as the ba- 
sis of our analysis of the effects of the different 
clumping factor models and the simulations of 
reionization. The following sections will show 
how we used these lines of sight to obtain dis- 
tributions of HII region sizes and other char- 
acteristics of the spectra. 

5. Results 

5.1. The mean transmitted flux 

Our primary goal is to reproduce the spec- 
tra of high redshift SDSS quasars as closely 
as possible. To accomplish this, the first step 
was to create a simulation of reionization cov- 
ering a large volume and compare it to ob- 
servations taken at high redshifts using SDSS 
data to better reproduce the mean transmit- 
ted flux. We ran a large number of simulations 
with varying f^sc and clumping factor mod- 
els to obtain the best fit to the SDSS data 
possible. These trial runs constrained /esc for 
each clumping factor model. Then this set 
of runs was used to test our approach and to 
determine whether our analysis with synthe- 

^The asymmetric shapes can clearly be seen in Figure 9 
for illustration of this phenomenon. 



sized spectra was appropriate. Thus, we ob- 
tained several 64^ cell test runs with either 
Wh-^ Mpc or 2h-^ Mpc cell-size. 

The production runs have 128^ cells with 
again either lOh"^ Mpc or 2h~^ Mpc cell-size 
for all three clumping factor models. In ad- 
dition we have runs for varying quasar life- 
time also with 128^ cells and 2h~^ Mpc cell- 
size, since this combination of simulation 
parameters gives us good resolution and a 
large enough volume to contain bright enough 
quasars. 

Table 1 summarizes a few of the best fit 
runs with all their initial parameters. Fig- 
ure 5 compares, for each of these runs, the 
mean transmitted flux averaged in redshift 
bins Az = 0.1 from z = 4 to 6, to that 
measured by Songaila et. al (2004) from 
SDSS quasars. In all cases shown in Fig- 
ure 5, the mean transmitted flux gives a rea- 
sonable fit to the SDSS data. This means 
that for each model, the escape fraction can 
be adjusted to achieve consistency with the 
mean transmitted flux measured from the 
SDSS data. Further analysis, discussed in 
Section 5.3 and 5.5), reveals some difference 
between the cases, such as in the distribution 
of HII sizes and of troughs in the spectra. 

When comparing the different cases, we ob- 
serve, as expected, that Case A has the largest 
fesc- This is because wc take into account all 
the photons in the high density regions. The 
best flt values for Case B are the smallest, 
only around a few percent (Section 5). The 
significant difference in the values for f^sc for 
the various clumping factor models show how 
different the methods of counting photons are 
and how they affect the amount of ionizing 
flux, but the fact that we can fit all three mod- 
els within reasonable error to the SDSS data 
shows that fesc is a convenient phenomenolog- 
ical parameter and not the same as the obser- 
vationally measured quantity. 

Fitting the effective escape fraction to get 
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the correct shape in a flux versus redshift plot 
requires several simulation runs with varying 
escape fraction (see Section 5.1) to close in on 
the correct fesc- We mainly fit the average 
transmitted flux between z = 5 and z = 6, 
because of the larger errors on the SDSS data 
at higher redshifts and our focus on simulating 
the period of reionization. 

In addition to the three different cases 
for the clumping factors, the best-fit values 
for the effective escape fraction also depend 
weakly on numerical resolution and box size. 
These effects are usually not large, but need 
to be taken into account to obtain the best 
fits to the data. 

Figure 5 shows that the case of 10^ yr 
quasar lifetime is barely distinguishable from 
that of infinite lifetime. In effect, lO'^yrs is 
tantamount to infinite lifetime. The best fit 
escape fractions are similar: fesc = 0.065 for 
10^ yrs, versus fesc = 0.07 for infinite lifetime. 
The effective escape fraction for the shorter 
lifetime is slightly smaller, because, for a given 
luminosity function, quasars with shorter life- 
times are less biased. 

The normalized transmitted flux of all the 
runs shown in Figure 5 tends to 0.2 at z = 4.5 
at the low redshift end, but for the high- 
redshift end there is a larger variation in the 
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Fig. 5. — Average flux versus redshift for the 
SDSS data and some of the best fits: SDSS 
data from White et al.{black individual trian- 
gles), Run 3 {black long dashed), Run 4 {gray 
dotted), Run 5 {black short dashed), Run 6 
{gray dash dot dot dot). Run 7 {gray dashed). 
Run 8 {gray dash-dot). Run 9 {black dots). 
Run 10 {black dash dot). Run 11 {black dash 
dot dot dot). Notice how similar all models 
and resolutions appear when fitted with fesc- 

flux between the different runs. The data ob- 
tained from SDSS at redshifts z > 5 have a 
larger error because there are fewer high red- 
shift quasar observations, so that our results 
lie within the error for this regime. The gen- 
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eral trend of the SDSS data is fit well, though 
the observed "bump" in intensity around z = 
5.5 does not appear clearly in any of our mod- 
els. 

Figure 5 illustrates that the model for the 
clumping factors only weakly effects the evo- 
lution of the IGM during reionization. One of 
the slight deviations can be seen for Case C, 
where the high density regions are excluded 
from the calculation of the clumping factors. 
The mean transmitted flux curve shows a 
somewhat more rapid ionization around z = 
5.5 to z = 6.0. The case C models follow 
the curve set by the SDSS data more closely, 
but at lower redshifts they tend to lower neu- 
tral fraction than case B models. Figure 5 
emphasizes that independent of the choice of 
cell-size, box size and model, there is a fesc 
that fits the transmitted flux of the simula- 
tion run to the SDSS data. 

Figure 6 shows two sets of the same data 
analyzed independently by White et al. 2003 
and Songaila et al. 2004. For clarity only 
the error bars on the White et al. set are dis- 
played. Also shown is a mean transmitted flux 
curve from one of our simulation and the la 
deviation of the flux from this mean in the var- 
ious lines of sight. The fit clearly lies within 
the error of the observational data. The vari- 
ations in mean transmitted flux from the sim- 
ulation increase toward higher redshift, indi- 
cating the spread in the timing of reionization 
along different lines of sight. 

For a clearer illustration of how reioniza- 
tion proceeds in our simulation, Figures 7 
shows the evolution with redshift of the aver- 
aged photoionization rate F in different sim- 
ulations. For these figures, the photoioniza- 
tion rate F of the gas was averaged over the 
entire simulation box instead of along a line 
of sight. In all the different cases shown the 
rapid increase of F at z ~ 7 is clearly visible, 
also the rate of increase of F versus redshift 
is lower than predicted by small box simula- 
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Fig. 6. — Mean transmitted flux for two differ- 
ent analyses of the same set of SDSS quasars: 
Songaila et al. (2004) {black triangles) and 
White et al. (2003) {open squares) compared 
to the mean transmitted flux calculated for 
run 10 (which we choose just for illustra- 
tion purposes; other models have very sim- 
ilar variations in the mean transmitted flux 
with redshift), a Case C 128^ simulation with 
2/i-iMpc cell -size {black line). Shown also 
are the lines corresponding to la change at 
each redshift for the synthetic spectra {dash- 
dot- dot- dot). 

tions (Gnedin 2000, 2004). This corresponds 
to a drop in neutral gas which is spread over 
a range in redshift of only Az ~ 1. 

Figure 7 illustrates how box and cell size 
affect the averaged photoionization rate F for 
case B. The graph indicates a spread of about 
a factor of three while the ionization rate in- 
creases by ~ 10^ as the redshift varies from 
z ~ 9 to 4. The main systematic difference 
appears to be that the photoionization rate 
is on the whole (except during reionization) 
lower in the lower resolution simulations, with 
lO/i-i Mpccell -size. This difference can be at- 
tributed to the difference in clumping factors, 
and can be interpreted as an indication of the 
uncertainty arising from the use of the clump- 
ing factor formalism. 
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Fig. 7. — Comparison of the averaged pho- 
toionization rate versus redshift for case B: 
Run 5{solid), Run 6 (dash-dot), Run 4 [dash- 
dot- dot), and Run 3 (dashed). 

5.2. Examples of Synthetic Spectra 

As described in Section 3.1, we use syn- 
thetic spectra to ahow comparison between 
the results of our simulations and observa- 
tions. Small scale variations are added to 
the synthetic spectra created from the simula- 
tions as described in Section 3.1. Thus regions 
with low flux correspond to regions where the 
mean free path is too small for any significant 
amount of radiation to be transmitted. We 
also observe spikes in the spectra, that origi- 
nate in the lower density regions. In the fol- 
lowing paragraphs we compare the large-scale 
variations with the synthetic spectrafrom the 
convolution of large and small scales. 

Figure 8 displays both the spectrum with 
the small scale flux added in and the large 
scale fluctuations directly from the simulation 
over the complete wavelength range (top) and 
near a bright quasar at z = 6.5 (bottom) and 
thus illustrates some of the properties of the 
synthetic spectra and their implications. The 
HII region around the quasar at the bottom 
of Figure 8 is indicated by the large scale flux 
distribution (gray). The limited resolution of 



the simulation, A = 8.5 A, smooths the flux 
distribution so that the HII region is clearly 
delineated. Note that using the small scale 
distribution of flux to determine the HII re- 
gion size would be more difficult, because the 
edge of the HII region is not clearly defined. 
The resolution of the spectrum with the small 
scale structure is taken to be that of the Keck 
high-resolution instrument. The Lyman-a- 
transmission is relatively sparse, which shows 
that the IGM is not ionized enough at this 
epoch to be transparent. This does not imply 
that the IGM is neutral, because even a small 
fraction of neutral hydrogen increases the op- 
tical depth drastically. 

The top panel of Figure 8 shows another 
synthetic spectrum of a quasar at z = 6.49, 
which is about the same distance as the fur- 
thest quasar observed so far, over the whole 
wavelength range from 9100 A to 6000 A. 
This plot shows how the large-scale neutral 
fraction decreases with shorter wavelengths 
because the universe is more ionized (Miralda- 
Escude et al., 2000) so that the average flux 
increases around 7500 A or z ~ 5.2, as can be 
seen in the large scale flux distribution and 
has been observed in SDSS spectra. Also, 
this figure shows how the lines thin out to- 
ward higher redshifts and the overall average 
flux decreases, similarly to the Gunn- Peterson 
trough found in observational spectra. 

The middle panel of Figure 8 shows the 
same spectrum as in the top panel. Here we 
zoom into some of the large gap regions cen- 
tered at 8550 A and its surroundings to show 
the sparse transmitted flux in more detail. 
Gaps this large at a relatively low redshift 
should correspond to large-scale overdensitis 
in the IGM (see also Section 5.5). It is also 
notable how dense the Lyman-a forest is at 
these redshifts. 

The bottom panel of Figure 8 is a zoom 
to the high redshift end of the spectrum. It 
shows the transmitted flux from the HII re- 
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gion of the quasar both on small scales {black 
line) and smoothed to a resolution of 10 A for 
calculating the HII region size (dash-dot). 
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Fig. 8. — The top panel shows a synthetic 
spectrum of a high-redshift quasar at z = 6.49 
with the flux distribution from the small scale 
structure shown in black and the flux from 
the large scale structure shown in gray. The 
middle panel shows a zoom to the troughs 
in the same spectrum. Note the dense ab- 
sorption forest at these wavelengths and large 
troughs with no transmitted flux. The bot- 
tom panel shows the same spectrum as above 
but zoomed to the region near the quasar 
at z = 6.49, the black line shows the flux 
with small scale structure, the dash dotted line 
shows the flux smoothed over 10 A to illus- 
trate the measurement of an HII region size. 



5.3. Analysis of Simulation for HII Re- 
gions Sizes 

A bright quasar can photoionize a large re- 
gion around it, its HII region. The HII region 
is detectable in spectra as an increase in trans- 
mitted Lyman-a flux to the red side of the 
quasar's Lynian-a emission line. Our spectra 
can be used to obtain properties of HII regions 
and then compare them to observed proper- 
ties. To analyze the sizes of the HII regions, 
we calculate several characteristics of the HII 
regions STirrounding a quasar and how they 
change with time. Some of these characteris- 
tics are also measurable in observed high res- 
olution spectra: e.g. the maximum flux, the 
distance at which the flux is minimal and the 
area under the flux versus wavelength curve. 
These values give a measure of the size of the 
HII region, and can be compared to observed 
data. Since flux increases on average with 
decreasing redshift due to the ionization of 
the IGM, the flux minimum should be a good 
measure of the boundary of the HII region. 

Small scale fluctuations and low spatial res- 
olution can prevent us from flnding the ac- 
tual minimum. Also our simulation does not 
produce spherical HII regions, as can be seen 
in in the upper two panels of Figure 9, caus- 
ing several different values for the size of an 
HII region depending on the direction of the 
line of sight cast. The images in Figure 9 
show two cross-sections of the simulation box 
gray-scale coded by the neutral fraction from 
xhi = 10~^ in black to xhi = 10~^ in white 
{xhi = ^^)- The top cross-section lies in the 
x-y plane, while the bottom one shows a z-y 
projection. The slices are positioned at the 
position of a luminous quasar in the simula- 
tion and shows the large HII regions (black) 
surrounding it as cross-section through the 
simulation box. 

In the top image of Figure 9, one can clearly 
see a relatively spherical HII region surround- 
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ing a bright quasar. Near this large HII re- 
gion we can observe other smaher highly ion- 
ized regions in dark gray, corresponding to 
xhi < 10^^. These regions correspond to 
fainter quasars and their HII regions. In the 
lower right part of the image is an example of 
a clearly non-spherical HII region around an- 
other very bright quasar. It is more difficult 
to obtain a reliable value for the size of the 
HII region because it is so irregularly shaped. 
Since most of our quasars produce ionized re- 
gions that are non-spherical, we cast several 
lines of sight from each quasar to obtain a sta- 
tistically relevant sample. 

The bottom panel of Figure 9 shows a per- 
pendicular slice of the same simulation. Here 
the complicated structure of the HII region is 
even more pronounced with features that re- 
semble a plume. There are also a few white 
spots in the image, which correspond to highly 
neutral regions in the simulation. These spots 
would show up as regions in spectra with no 
transmitted flux. 

Figure 10 gives a closer view of the same 
HII region. The black area is the highly 
ionized region and has a diameter of about 
8h~^ Mpc, however the ionized region extends 
much further (shown in darker gray) . As men- 
tioned above, we can see the irregular, elon- 
gated shape of the ionized region, surrounded 
by almost completely neutral gas on either 
side. This high density gas has not yet been 
fully ionized and confines the HII region to a 
"tunnel". The ionizing radiation can escape 
from the quasar only through this tunnel, be- 
cause the mean free path is much lower in the 
orthogonal direction. We can again see sev- 
eral areas with low neutral fraction around 
the main HII region. Most of these spots 
that have a neutral fraction of about 10~'^'^ 
are clustered together with other more ionized 
"blobs" shown in lighter gray. 

At this redshift (z = 6.28) the universe 
is already very close to fully ionized, so the 
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Fig. 9. — Top panel: This x-y slice through 
the simulation box shows a view of the neu- 
tral fraction distribution in a 128'^ simulation 
with 2h~^Mpc cell-size at z = 6.28. Several 
luminous quasars and their extended HII re- 
gions can be see as darker spots. For example 
the irregularly shaped dark region in the lower 
left corresponds to a HII region of a luminous 
quasar surrounded by dense more neutral gas 
(shown in light gray). Bottom panel: A dif- 
ferent view (z-y plane) of the simulation, with 
the same bright quasar in the bottom right 
corner. Note the asymmetry in the shape of 
the HII region {dark gray). 

dark gray areas that have a neutral fraction 
of 10~^ — 10~^ are the most neutral regions in 
the simulation at this redshift. However this 



17 




Fig. 10. — A close-up view of the HII region 

of the bright quasar in the simulation from 
Figure 9. Again, the asymmetry in the shape 
of the HII region is clearly visible( darfc gray) 

does not imply that the spectra will have a 
large transmitted flux, because even a neutral 
fraction as low as 10~^ is sufficient to absorb 
all flux from the quasar. 

5.4. Properties of HII Regions 

The sizes of HII regions measured from the 
extent in wavelength of the region of increased 
flux offers a point of comparison between sim- 
ulations and observations. 

The simulations show that a quasar's 
HII region is typically neither spherical nor 
sharply defined, so the notion of its size is in- 
evitably somewhat hazy. We define the size of 
an HII region for our synthetic spectra as the 
first minimum with F < 0.01 or F < 0.03. 
The flux cutoff is calculated by smoothing 
the synthetic spectra including the small scale 
flux, so that the resolution is 10 A. The first 
minimum of the flux that falls below the cut- 
off flux is defined as the edge of the HII region 
and thus corresponds to a region where there 
is no or little flux on a scale of 10 A. 

This definition reflects the fact that the 



degree of ionization should be high in the 
immediate vicinity of the quasar, should de- 
crease moving away from the quasar, and then 
should increase again because of the general 
ionization of the universe. It is also similar 
to methods for determining the size of HII re- 
gions from observational spectra. 

The location of the minimum of the trans- 
mitted flux, which yields the size of the HII 
region, depends on the resolution of the spec- 
trum. To allow unbiased comparison between 
simulations and observations, the resolution 
of the synthetic spectra must match closely 
that of the observations. 

Because the HII regions are asphcrical, dif- 
ferent lines of sight yield different measures 
of its size. For each simulation, we con- 
struct synthetic spectra from three random 
lines of sight through each of the fifty bright- 
est quasars. The result is an ensemble of 
~ 150 spectra from each simulation. It is this 
ensemble of spectra for each simulation which 
we compare to observations in section 5. 

For example. Figure 11 shows one of the 
distributions of the size of the HII region as 
a function of the area under the flux versus 
wavelength curve. The area under the spec- 
trum indicates the degree to which the gas 
in the HII region is ionized. The distribution 
shown is for a simulation with case B clump- 
ing factors, 2h~^ Mpc cell size and 128'^ cells 
at z = 6.28. In general the distance to the 
minimum flux and the area under the spec- 
trum are two properties that can be observed 
and then compared to our simulations if the 
resolution is known. For example the data for 
SDSSJ1030-F0524 is also included in the flgure 
{black square). 

Figure 12 illustrates the measurement of 
the HII region size of SDSS 1030-^0524 from 
the high-resolution spectrum. The gray curve 
displays the spectrum smoothed to a spatial 
resolution corresponding to 10 A. The po- 
sition of the redshifted Lyman-a line {dash- 
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Fig. 11. — Distribution of HIT regions in a 
128^ simulation of case B, 2h^^ Mpc cell size 
and with infinite quasar lifetime (gray trian- 
gles correspond to the cutoff at F < 0.03 and 
black triangles corresponds to -F < 0.01), also 
indicated is the HII region size for the quasar 
SDSSJ10230+0524 {black square) 

dot) and two variations of determining the 
assumed edge of the HII region (dashed and 
dash- dot- dot- dot) are shown as vertical lines. 
The line at 8748 A shows the edge of the HII 
region using the smoothed flux and finding the 
first minimum that also lays below the thresh- 
old flux of 0.01. This measurement is the same 
as the method used on the simulation data 
and yields a size of 4.76 Mpc. 

The discrepancy of the area under the 
spectrum, apparent from Figure 11, for the 
quasars from our simulation and the SDSS 
quasar can be attributed to the much higher 
luminosity of the SDSS quasar. Higher lumi- 
nosity quasars will emit more photons, which 
not only increases the size, but also reduces 
the ionization fraction of the gas in the HII 
region. When comparing this graph to Fig- 
ure 8 it is clear that the transmitted flux near 
the quasar, corresponding to the amount of 
ionized material, is significantly larger for the 
observed quasar than the transmitted flux for 
any of our simulated ones. This means that 



Fig. 12.— Spectrum of SDSSJ1030+0524 
(from X.Fan) in the immediate surrounding 
of the Lyman-a line. The gray curve illus- 
trates the smoothed spectrum to a spatial res- 
olution corresponding to a 2h~^ Mpc cell-size. 
Shown as vertical lines are the wavelength 
of Lyman-a (dash-dotted) and two different 
measurements of the assumed edge of the HII 
region: minimum with F < 0.01 (dash-dot- 
dot-dot) and minimum for higher resolution 
the smoothed flux (dashed). 

the gas is more highly ionized around this 
quasar. 

Figure 13 displays the effects of simulation 
resolution on measuring the HII region sizes 
by comparing 2h~^ Mpc and Wh~^ Mpc cell 
size runs. Two distributions are rather sim- 
ilar, also we find fewer HII regions for lower 
resolution simulations, as would be expected. 

Figure 14 shows the dependence of the size 
of the HII regions on quasar lifetime. Of the 
three lifetimes shown, 10^, 10^ and 10^ years, 
the longest, 10^ years, gives results that are 
barely distinguishable from infinite lifetimes. 
For the shortest lifetime, however, one can see 
the effects of new quasars starting new HII re- 
gions. For longer lifetimes the quasar can emit 
photons into the IGM for longer time periods 
after ionizing its immediate high-density sur- 
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Fig. 13. — Comparison of two distribution of 
HII regions in 128^ simulations of case B with 
infinite quasar lifetime, one with 2/i"iMpc 
cell-size (^ra J/ triangles), the other one with 
10/t~-'^Mpc cell-size( 6ZacA; triangles). 



roundings, so that the HII regions for shorter 
lifetimes will be smaller. 
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Fig. 14. — Comparison of three distribution of 
HII regions in 128^ simulations of case B with 
2h~^ Mpc cell-sizeand with quasar lifetimes of 
10^ yrs {open triangles), 10^ yrs {gray squares) 
and 10^ yrs {black triangles). 

The lines of sight for the spectra start at 
the position of the quasar and extend in a ran- 
dom direction through the box, therefore we 



Fig. 15. — This Figure shows the irregularity 
of HII region sizes for a 128^ simulation with 
2/i^^ Mpc cell-size filled triangles) with 
the maximum size versus the average size from 
three lines of sights. The line corresponds to 
spherical HII regions. 

only probe one specific part of each HII region 
with each large-scale spectrum. Thus, using 
just one line of sight for measuring the size 
of the ionized region surrounding each quasar 
does not necessarily represent the actual HII 
region size. Consequently, casting three lines 
of sight from each quasar should help lower 
the chance of not detecting an HII region. 
We will cover different directions pointing out 
from the source so that even if the quasar is 
located at the edge of a very high density re- 
gion in the IGM, we should be able to detect 
the HII region in one of the three lines of sight 
cast. We used each line to measure the size of 
the HII region separately and so accumulated 
distributions for their sizes^. 

Figure 15 addresses the question of shape 
of the HII regions. The three lines of sight 
for each quasar yield a maximum and a mini- 
mum size, which are shown in filled triangles. 



^Because most of the HII regions are barely resolved 

even with 2h~^ Mpc resolution, using more than 3 lines 
of sight does not actually improve statistics. 
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The line corresponds to spherical HII regions, 
where the two parameters would be equal. 
The figure shows that almost none of the HII 
regions are spherical. Some of the spectra 
show a large discrepancy between maximum 
and minimum size, for a few we do not even 
detect the HII region in one of the directions 
within the limits of our numerical resolution. 
Thus, the measurement of HII region sizes 
from one line of sight of a quasar does not 
provide a complete picture of the volume con- 
tained in the HII region. 

One of the difficulties in determining the 
HII region size from this simulation is that 
our spatial resolution for the spectra can only 
be as small as the cell-size of the simulation 
box. This causes smoothing on scales of 2 to 
10h~^ Mpc, depending on the simulation run. 
This means that wc can easily miss or over- 
estimate the HII region size. If the quasar 
is positioned in an extremely high density re- 
gion, the ionized volume surrounding it can be 
too small to appear in a spectrum with such 
high resolution. Also, if the smoothing causes 
the edge of the HII region (set to be at the 
closest flux minimum to the quasar in concor- 
dance with the method used for the synthetic 
spectra) to disappear, the size of the ionized 
volume due to the quasar can be much smaller 
than determined by our method. 

5.5. Troughs in the Spectra 

In addition to analyzing the ionization of 
the IGM around luminous sources, we can also 
investigate the distribution of low flux regions, 
associated with higher neutral fraction, using 
the synthetic spectra. Regions in the sim- 
ulation volume with low ionization fraction 
along the line of sight will appear as troughs in 
the absorption spectrum. The extent of these 
troughs in the spectra and their positions in 
space as determined by their redshifts, give 
us information about structure of the IGM be- 
tween the quasar and the observer. Since only 



such a small amount of neutral gas is required 
to absorb almost all of the flux, the depth of 
the troughs is not used for analysis. 

For this analysis, we use the same sets of 
synthesized spectra as described in Section 3.1 
and scan them for regions with extremely low 
flux (below a threshold detectable by instru- 
ments, here set to 10^^). We then compile 
distributions of these troughs from the 150 
spectra obtained from each simulation and 
plot their distributions. A few examples of 
these distributions are shown in Figures 16 
through 17. These figures show only the 
largest troughs, which extend more than 10 A. 

We scanned the spectra for troughs be- 
tween ~ 9100 A (corresponding to z = 6.5) 
and ~ 7200 A (corresponding to z = 4.9). 
The most striking feature of this set of dis- 
tributions is their similarity. For example all 
the different runs display a decrease in average 
length of the low flux regions toward shorter 
wavelengths. This feature can be explained 
by the increase in the mean free path for ion- 
izing photons due to the reionization of the 
universe. 

Figure 16 illustrates the similarity of the 

distributions by taking a closer look at the 
trough distributions for two different clump- 
ing factor models (Case C in gray and Case B 
in black). The two distributions nearly com- 
pletely overlay on each other. The only visible 
difference is a small excess of large high red- 
shift troughs for case C and an excess for case 
B at lower redshifts (A ~ 7700). 

Another notable similarity is the fact that 
the largest troughs for each simulation are all 
about the same size, 100 A. Most of these 
long stretches of low flux in the spectra lie 
above 8500 A, with a few outliers in some of 
the simulations. For example Figure 16 shows 
two troughs of about 40 A centered at 7800 A 
for case B, which corresponds to a stretch of 
~ 3.5 Mpc at redshift 5.4 or lAh~^ Mpc in co- 
moving units. Considering that a trough only 
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appears in our distribution when the flux is 
zero, these two troughs are produced by very 
large regions of relatively neutral gas in the 
IGM. For example, Figure 8 shows a spectrum 
containing a large trough near z = 6.0 that ex- 
tends for 100 A, similar to the largest troughs 
shown in our distribution. 

Figure 17 shows that we can, similarly 
to the HII regions size calculations, compare 
our results for trough distributions from syn- 
thetic spectra with the quasar spectrum SDSS 
J1030+0524. Since the data of the observed 
spectrum contains noise, we have to set a level 
of uncertainty below which we assume that 
the transmitted flux is zero. The black trian- 
gles correspond to 3a as the threshold for no 
transmitted flux and the open squares corre- 
spond to a threshold of 5(J. The distribution 
for the troughs obtained from the simulation 
is shown in Figure 17 in gray triangles and 
the similarities of the observation to the sim- 
ulation are clearly visible. The 3a detection 
limit yields a better agreement with our data, 
but both values show the same decrease in 
large troughs toward lowed redshifts and sim- 
ilar sizes for the troughs on the whole redshift 
range. 

In contrast to measurements of HII region 
sizes, the comparison of resolution sizes of 
the simulation (Figure 18) does not reveal a 
strong dependence on resolution. The largest 
trough size for both cases is around 100 A 
at high redshift and both show a slow de- 
crease in trough size toward the lower red- 
shifts. In addition they both exhibit a few 
outliers of larger size as mentioned above. 
There is a slight excess of larger troughs for 
the 10h~^ Mpc simulation. This is due to the 
larger step size in the synthetic spectra which 
removes some of the small spikes in transmit- 
ted flux that intersect large troughs. 

Altogether it is clear that the differences for 
varying parameters of the simulations do not 
change the distribution of troughs in the spec- 



tra significantly. This invariance to parameter 
changes allows us to conclude that as long as 
we fit the observational data from SDSS with 
the initial conditions we will receive the same 
information about the Lyman-a forest. 
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Fig. 16. — Comparison of two distributions 
of troughs in 150 spectra for a 128^ simula- 
tion with 2h^^Mpc cell-size at z = 6.5, case 
B {black open triangles) and case C {filled gray 
triangles) 

5.6. Flux decrement 

In addition to looking at how the mean 
transmitted flux changes over time, the change 
in ionization of the IGM can also be observed 
when we consider how many pixels of the 
spectra fall below a certain flux at each red- 
shift. Figure 19 shows this "flux decrement" 
for 5 different levels: Fq < 0.3, Fq < 0.2, 
Fo < 0.1, Fo < 0.05 and Fq < 10~^. One 
can also observe that even at the low red- 
shift end, around z = 4.0, most of the pixels 
still have F < 0.3. The curve corresponding 
to lowest flux limit, which reflects to the re- 
gion in redshift where effectively all the flux 
is absorbed in neutral gas, displays a sharp 
boundary around z = 5.5, below which all 
pixels have more flux than 10~^. This means 
that after redshift 5.5, the universe is ion- 
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Fig. 17. — Distribution of troughs for a 128'^ 
simulation with 2h~^Mpc cell-size at z = 6.28 
{small gray triangles) with the troughs from 
SDSSJ1030+0524 overlayed for a three a de- 
tection {black triangles) and for a five a de- 
tection {open squares). The simulation spec- 
tra were smoothed to the spectral resolution 
of observations and noise was added to better 
compare it to the observations. 

ized enough to eliminate the Gunn-Petersen 
trough. 

It is also important to note that all the dis- 
tributions reach 1.0 toward higher redshifts, 
meaning that there is not a significant amount 
of flux transmitted in this regime. Thus we 
can identify the region in the spectra above 
z = 5.5 as the Gunn-Peterson trough. The 
change in transmitted flux appears to happen 
within a short period of time, supporting the 
idea that reionization happened quickly. 

6. Discussion and Conclusions 

Our cosmological simulations are used to 
model the large-scale effects of reionization 
including luminous high redshift quasars. A 
major issue with simulations of large volumes 
is that a large dynamic range needs to be cov- 
ered including cosmic structures on a diverse 
range of scales. To solve this problem, we 



Fig. 18. — Distribution of troughs for two 
128^ simulations, one with 2h~^ Mpc cell- 
size {filled black triangles) and one with 
lO/i"-'^ Mpc {gray diamonds). This compari- 
son of resolutions shows no significant differ- 
ence between the two cell sizes. 

have used clumping factors to approximate 
the small scale structure of the cosmic gas. 
These clumping factors are derived from small 
volume simulations that can simulate struc- 
tures on kpc-scale and then used to produce 
results on Gpc scales with resolution up to 
10h~^ Mpc. We developed three models for 
computing the clumping factors: 1) case A, 
where clumping factors are weighted by vol- 
ume, thus including local absorption and ion- 
ization, 2) case B, where the clumping fac- 
tors are weighted by the inverse of the density 
therefore lowering the influence of high den- 
sity regions and 3) case C, where we disregard 
all volume elements with high gas density and 
with that all absorption and ionization in high 
density regions. All three cases yield similar 
results and can fit the mean transmitted flux 
as observed by SDSS. 

We use these simulations to construct syn- 
thetic spectra of quasars comparable to ob- 
served spectra of SDSS quasars at high red- 
shifts. These synthetic spectra are used to in- 
vestigate HII region sizes of bright quasars in 
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Fig. 19. — Fraction of pixels with transmit- 
ted flux Fq < 0.3 (continuous), Fq < 0.2 (dot- 
dashed), Fq < 0.1 (dot- dot- dashed), Fq < 0.05 
(short dashed) and Fq < 10~^ (long dashed). 
The rapid increase in transmitted flux is ap- 
parent when the last percent of the neutral 
gas is ionized. 

our simulations and to compare with HII re- 
gions of observed quasars. We found a signif- 
icant scatter in the distribution for the sizes, 
even when looking at a large number of quasar 
HII regions. Our investigation shows that 
there is a dependence of HII region size on 
quasar lifetime (Figure 14), with larger HII 
regions for longer quasar lifetimes. However, 
this relation is too weak to put constraints 
of quasar lifetimes by observing their HII re- 
gions. 

The synthetic spectra can be used to make 
detailed comparison between the observa- 
tional data and the simulations. In this first, 
introductory paper, we limit this compari- 
son only to the distribution of HII regions 
around bright quasars and the distribution of 
throughs (regions with no flux) in the spectra. 
In the future work we plan to use synthetic 
spectra for a more detailed and elaborate com- 
parison with the observational data. 

Future improvements to our method will 



include, as has been mentioned in § 2.2, an 
account for the scatter in the modeled clump- 
ing factors. As we have discussed above, the 
clumping factors on scales we are interested in 
(several comoving Mpc) are not deterministic 
functions of the gas properties, and must be 
treated statistically. 

In addition, since our box volumes are still 
relatively small and most quasars found in 
our simulation boxes are still too faint to be 
detected by even the best telescopes, con- 
strained simulations can be used to model HII 
regions around the quasars comparable in lu- 
minosity to those observed by the SDSS. 

This weak dependence of the simulation re- 
sults on the particular choice of clumping fac- 
tors is the main justification of our approach. 
It demonstrates that, as long as ionization and 
recombinations are counted sclf-consistently, 
the specific form of spatial averaging used to 
define clumping factors is not too important. 
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